Resource-efficient fault and intrusion tolerance

Author

  • Tobias Distler
Abstract

More and more network-based services are considered essential by their operators, either because their unavailability might directly lead to economic losses, as with e-commerce applications or online auction services, or because other services depend on their correct operation, as is the case for distributed file systems or coordination services. Byzantine fault-tolerant replication allows systems to be built that ensure the availability and reliability of network-based services even if a subset of replicas fail arbitrarily. As a consequence, such systems tolerate not only fault scenarios in which replicas crash, but also cases in which replicas have been taken over by an adversary as the result of a successful intrusion. Although several major outages of network-based services in the past have been caused by non-crash failures, industry is still reluctant to broadly exploit the available research results on Byzantine fault tolerance. One of the main reasons for retaining crash-tolerant systems is the high resource demand associated with Byzantine fault-tolerant systems: besides the need to execute more costly protocols, the more complex fault model also requires Byzantine fault-tolerant systems to comprise more replicas than their crash-tolerant counterparts. In this thesis, we propose and evaluate different protocols and techniques to increase the resource efficiency of Byzantine fault-tolerant systems. The key insights that serve as a basis for all of these approaches are that during normal-case operation it is sufficient for a system to detect (or at least suspect) faults, while during fault handling a system must be able to actually tolerate faults, and that the former usually requires fewer resources than the latter.
Utilizing these insights, we investigate different ways to improve resource efficiency by implementing a clear separation between normal-case operation and fault handling based on two modes of operation: during normal-case operation, a system reduces its resource usage to a level at which it is only able to ensure progress as long as all replicas behave according to specification. In contrast, in case of suspected or detected faults, the system switches to an operation mode in which it may use additional resources in order to tolerate faults. An important outcome of this thesis is that passive replication can be an effective building block for the implementation of a resource-efficient operation mode for normal-case operation in Byzantine fault-tolerant systems. Furthermore, experimental results show that improving the resource efficiency of a system can also lead to an increase in performance.
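
The two modes of operation described above can be illustrated with a minimal sketch. It is not one of the thesis's protocols: it assumes deterministic replicas that always return some reply, considers only request execution (no agreement or ordering stage), and uses hypothetical names (`Replica`, `execute`). Under these assumptions, f + 1 matching replies are enough to detect that nothing went wrong, while 2f + 1 replies are needed to outvote up to f faulty replicas.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical replica model: a function mapping a request to a reply.
Replica = Callable[[str], str]


def execute(request: str, replicas: List[Replica], f: int) -> str:
    """Execute a request while assuming at most f replicas are faulty."""
    if len(replicas) < 2 * f + 1:
        raise ValueError("need at least 2f + 1 replicas")

    # Normal-case mode: only f + 1 replicas execute the request.
    replies = [replica(request) for replica in replicas[: f + 1]]
    if len(set(replies)) == 1:
        # All f + 1 replies match, so at least one comes from a correct
        # replica and the result can be accepted.
        return replies[0]

    # Fault-handling mode: a mismatch was detected, so f additional
    # replicas are activated to outvote the faulty ones.
    replies += [replica(request) for replica in replicas[f + 1 : 2 * f + 1]]
    value, count = Counter(replies).most_common(1)[0]
    if count >= f + 1:
        return value
    raise RuntimeError("fault assumption violated: more than f faulty replies")
```

In this sketch the extra replicas stay idle as long as the first f + 1 replies agree, which is where the resource savings would come from; the additional cost is only paid once a fault is detected.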

Similar resources

Practical Intrusion-tolerance in the Cloud

Byzantine fault-tolerant (BFT) replication is commonly associated with the overhead of 3f + 1 replicas to handle f faults. We believe this large resource demand is one of the key reasons why BFT replication is not commonly applied. We present Spare, an approach that harnesses virtualization support as typically found in cloud-computing environments to reduce the resource demand of BFT replicatio...


Intrusion-Tolerant Parsimonious State Machine Replication

We describe a Byzantine-fault-tolerant state machine replication algorithm that reduces computation and communication costs in the fault-free case, and is reasonably efficient even in the presence of faults. Such an algorithm is practically significant, because failures are the exception rather than the norm, and much of a system’s runtime is fault-free. The algorithm is geared towards applications th...


Security Models for Wireless Sensor Networks

Wireless Sensor Networks (WSNs) are a new technology foreseen to be used increasingly in the near future due to their data acquisition and data processing abilities. Security for WSNs is an area that needs to be considered in order to protect the functionality of these networks, the data they convey and the location of their members. The security models & protocols used in wired and other netwo...


Sensitive Data Protection Based on Intrusion Tolerance in Cloud Computing

Service integration and on-demand supply in cloud computing can significantly improve the utilization of computing resources, reduce power consumption per service, and effectively avoid errors in computing resources. However, cloud computing still faces the problem of intrusion tolerance for the cloud computing platform and for the sensitive data of new enterprise data centers. In o...


Intrusion-Resilient Middleware Design and Validation

Intrusion Tolerance has become a reference paradigm for dealing with intrusions and accidental faults, achieving security and dependability in an automatic way, much along the lines of classical fault tolerance. This chapter is an introduction to the design and validation of intrusion-tolerant middleware and systems.


Publication date: 2014